GRAPE for fast and scalable graph processing and random-walk-based embedding


Abstract

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.
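To make the idea of random-walk sampling over a compact graph structure concrete, the following is a minimal pure-Python sketch. It is not GRAPE's API or implementation: the compressed sparse row (CSR) layout is an assumption chosen as one common compact adjacency format, and the names `build_csr` and `random_walk` are hypothetical. It samples first-order uniform walks only, the simplest input to random-walk-based embedding methods.

```python
import random


def build_csr(num_nodes, edges):
    """Build a CSR adjacency (offsets, neighbors) for an undirected graph."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # offsets[i]..offsets[i + 1] delimits the neighbors of node i.
    offsets = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + degree[node]
    neighbors = [0] * offsets[-1]
    cursor = offsets[:-1]  # slice copy: next free slot per node
    for u, v in edges:
        neighbors[cursor[u]] = v
        cursor[u] += 1
        neighbors[cursor[v]] = u
        cursor[v] += 1
    return offsets, neighbors


def random_walk(offsets, neighbors, start, length, rng):
    """Sample a uniform random walk of at most `length` nodes from `start`."""
    walk = [start]
    node = start
    for _ in range(length - 1):
        begin, end = offsets[node], offsets[node + 1]
        if begin == end:  # isolated node: the walk cannot continue
            break
        node = neighbors[rng.randrange(begin, end)]
        walk.append(node)
    return walk


# Toy graph (illustrative data only): a 4-cycle with one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
offsets, neighbors = build_csr(4, edges)
rng = random.Random(42)
walks = [random_walk(offsets, neighbors, start, 5, rng) for start in range(4)]
```

Each walk is a sequence of node identifiers. Embedding methods in the random-walk family typically treat such sequences like sentences and train a skip-gram-style model on them. Storing the adjacency in two flat arrays keeps each step to a constant-time lookup, and independent walks can be sampled in parallel.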


Similar resources

Fast Random Walk Graph Kernel

Random walk graph kernel has been used as an important tool for various data mining tasks including classification and similarity computation. Despite its usefulness, however, it suffers from the expensive computational cost which is at least O(n^3) or O(m^2) for graphs with n nodes and m edges. In this paper, we propose Ark, a set of fast algorithms for random walk graph kernel computation. Ark is...


Graph Embedding through Random Walk for Shortest Paths Problems

We present a new probabilistic technique of embedding graphs in Z^d, the d-dimensional integer lattice, in order to find the shortest paths and shortest distances between pairs of nodes. In our method the nodes of a breadth-first search (BFS) tree, starting at a particular node, are labeled as the sites found by a branching random walk on Z^d. After describing a greedy algorithm for routing (distanc...


Community aware random walk for network embedding

Social network analysis provides meaningful information about the behavior of network members that can be used for diverse applications such as classification and link prediction. However, network analysis is computationally expensive because of feature learning for different applications. In recent years, much research has focused on feature learning methods in social networks. Network embedding r...


Bilingual Data Cleaning for SMT using Graph-based Random Walk

The quality of bilingual data is a key factor in Statistical Machine Translation (SMT). Low-quality bilingual data tends to produce incorrect translation knowledge and also degrades translation modeling performance. Previous work often used supervised learning methods to filter low-quality data, but these require a fair number of human-labeled examples, which are not easy to obtain. To reduce the re...


Scalable Methods for Random Walk with Restart and Tensor Factorization

“Big data” has received considerable interest from both academia and industry in the last decade. It turned out that mining large-scale data enables us to obtain machine learning models with higher accuracy and extend our knowledge about large complex systems such as the Web and social media. However, the enormous volume of data prevents us from simply using previous machine learning or data mining...



Journal

Journal title: Nature Computational Science

Year: 2023

ISSN: 2662-8457

DOI: https://doi.org/10.1038/s43588-023-00465-8